Abstract: Consider a waveform channel where the transmitted 
signal is corrupted by Wiener phase noise and additive white 
Gaussian noise (AWGN). A discrete-time channel model that 
takes into account the effect of filtering on the phase noise 
is developed. The model is based on a multi-sample receiver 
which, at high Signal-to-Noise Ratio (SNR), achieves a rate that 
grows logarithmically with the SNR if the number of samples 
per symbol grows with the square-root of the SNR. Moreover, 
the pre-log factor is at least 1/2 in this case. 

I. Introduction 

Phase noise is an impairment that often arises in coherent 
communication systems. Different models are adopted for the 
phase noise process depending on the application. In [1], 
Katz and Shamai studied a discrete-time model of a phase 
noise channel (partially coherent channel) in which the phase 
noise is independent and identically distributed (i.i.d.) with 
a Tikhonov distribution. This model is reasonable for the 
residual phase error of a phase-tracking scheme, such as a 
Phase-Locked Loop (PLL). In [2], the authors investigate 
white (Gaussian) phase noise, for which they observed a 
"spectral loss" phenomenon. The white phase noise approximates 
the nonlinear effect of cross-phase modulation (XPM) in a 
Wavelength-Division Multiplexing (WDM) optical communication 
system. In [3], Lapidoth studied a discrete-time phase 
noise channel 

$$Y_k = X_k e^{j\Theta_k} + N_k \tag{1}$$

at high SNR, where $\{Y_k\}$ is the output, $\{X_k\}$ is the input, 
$\{\Theta_k\}$ is the phase noise process and $\{N_k\}$ is the additive 
noise. He considered both memoryless phase noise and phase 
noise with memory. He showed that the capacity grows 
logarithmically with the SNR with a pre-log factor 1/2, where 
the pre-log is due to amplitude modulation only. The phase 
modulation contributes a bounded number of bits only. 

In this paper, we study a communication system in which 
the transmitted waveform is corrupted by Wiener phase noise 
and AWGN. The model is 



$$r(t) = x(t) \exp(j\theta(t)) + n(t), \quad \text{for } t \in \mathbb{R} \tag{2}$$

where $x(t)$ and $r(t)$ are the transmitted and received signals, 
respectively, while $n(t)$ and $\theta(t)$ are the additive and phase 
noise, respectively. A detailed description of the model is 
given in Sec. II. One application for such a channel model 
is optical communication under linear propagation, in which 
the laser phase noise is a continuous-time Wiener process 
(see [4] and references therein). Since the sampling of a 
continuous-time Wiener process yields a discrete-time Wiener 
process (Gaussian random walk), it is tempting to use the 
model (1) with $\{\Theta_k\}$ as a discrete-time Wiener process, but this 
ignores the effect of filtering prior to sampling. It was pointed 
out in [4] that "even coherent systems relying on amplitude 
modulation (phase noise is obviously a problem in systems 
employing phase modulation) will suffer some degradation due 
to the presence of phase noise". This is because the filtering 
converts phase fluctuations to amplitude variations. It is worth 
mentioning that filtering is necessary before sampling to limit 
the variance of the noise samples. 

The model (1) thus does not fit the channel (2) and it is not 
obvious whether a pre-log of 1/2 is achievable. The model that 
takes the effect of (matched) filtering into account is 

$$Y_k = X_k H_k + N_k \tag{3}$$

where $\{H_k\}$ is a fading process. The model (3) falls in the 
class of non-coherent fading channels, i.e., the transmitter 
and receiver have knowledge of the distribution of the fading 
process $\{H_k\}$, but have no knowledge of its realization. For 
such channels, Lapidoth and Moser showed in [5] that, at high 
SNR, the capacity grows double-logarithmically with the SNR 
when the process $\{H_k\}$ is stationary, ergodic, and regular. 

Rather than using a matched filter and sampling its output at 
the symbol rate, we use a multi-sample receiver, i.e., a filter 
whose output is sampled many times per symbol. We show 
that this receiver achieves a rate that grows logarithmically 
with the SNR if the number of samples per symbol grows 
with the square-root of the SNR. Furthermore, we show that 
a pre-log of 1/2 is achievable through amplitude modulation. 
In this paper, we study only rectangular pulses but we believe 
that the results hold qualitatively for other pulses. 

The paper is organized as follows. The continuous-time 
model is described in Sec. II and the discretization is described 
in Sec. III. We derive a lower bound on the capacity in Sec. IV 
and discuss our result in Sec. V. Finally, we conclude the 
paper with Sec. VI. 



II. Continuous-Time Model 

We use the following notation: $j = \sqrt{-1}$, $^*$ denotes the 
complex conjugate, $\delta_D$ is the Dirac delta function, $\lceil \cdot \rceil$ is the 
ceiling operator, $\Re[\cdot]$ is the real part of a complex number, 
$\log(\cdot)$ is the natural logarithm and we use $X^k$ to denote the 
$k$-tuple $(X_1, X_2, \ldots, X_k)$. Suppose the transmit-waveform is 
$x(t)$ and the receiver observes 

$$r(t) = x(t) \exp(j\theta(t)) + n(t) \tag{4}$$



where $n(t)$ is a realization of a white circularly-symmetric 
complex Gaussian process $N(t)$ with 

$$\begin{aligned}
E[N(t)] &= 0 \\
E[N(t_1) N^*(t_2)] &= \sigma_N^2 \, \delta_D(t_2 - t_1).
\end{aligned} \tag{5}$$

The phase $\theta(t)$ is a realization of a Wiener process $\Theta(t)$: 

$$\Theta(t) = \Theta(0) + \int_0^t W(\tau) \, d\tau \tag{6}$$

where $\Theta(0)$ is uniform on $[-\pi, \pi)$ and $W(t)$ is a real Gaussian 
process with 

$$E[W(t)] = 0 \tag{7}$$
$$E[W(t_1) W(t_2)] = 2\pi\beta \, \delta_D(t_2 - t_1). \tag{8}$$



The processes $N(t)$ and $\Theta(t)$ are independent of each other 
and independent of the input as well. $N_0 = 2\sigma_N^2$ is the 
single-sided power spectral density of the additive noise. 
The parameter $\beta$ is called the full-width at half-maximum 
(FWHM), because the power spectral density of $e^{j\Theta(t)}$ has 
a Lorentzian shape, for which $\beta$ is the full-width at half the 
maximum. The transmitted waveforms must satisfy the power 
constraint 

$$E\left[ \frac{1}{T} \int_0^T |X(t)|^2 \, dt \right] \le \mathcal{P} \tag{9}$$

where $T$ is the transmission interval. 

III. Discrete-Time Model 

Let $(x_1, x_2, \ldots, x_n)$ be the codeword sent by the transmitter. 
Suppose the transmitter uses a unit-energy rectangular 
pulse, i.e., the waveform sent by the transmitter is 

$$x(t) = \sum_{m=1}^{n} x_m \, g(t - (m-1) T_{\text{symbol}}) \tag{10}$$

where $T_{\text{symbol}}$ is the symbol interval and 

$$g(t) = \begin{cases} \sqrt{1/T_{\text{symbol}}}, & 0 \le t < T_{\text{symbol}}, \\ 0, & \text{otherwise.} \end{cases} \tag{11}$$

Let $L$ be the number of samples per symbol ($L \ge 1$) and 
define the sample interval $\Delta$ as 

$$\Delta = \frac{T_{\text{symbol}}}{L}. \tag{12}$$



The received waveform r(t) is filtered using an integrator 
over a sample interval to give the output signal 

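$$y(t) = \int_{t-\Delta}^{t} r(\tau) \, d\tau \tag{13}$$

where $y(t)$ is a realization of $Y(t)$. The output $Y(t)$ is sampled 
every $\Delta$ seconds, which yields the discrete-time model 

$$Y_k = X_{\lceil k/L \rceil} \, \Delta \, e^{j\Theta_k} F_k + N_k \tag{14}$$

for $k = 1, \ldots, nL$, where $Y_k = Y(k\Delta)$, $\Theta_k = \Theta((k-1)\Delta)$, 

$$F_k = \frac{1}{\Delta} \int_{(k-1)\Delta}^{k\Delta} e^{j(\Theta(\tau) - \Theta_k)} \, d\tau \tag{15}$$

and 

$$N_k = \int_{(k-1)\Delta}^{k\Delta} N(\tau) \, d\tau. \tag{16}$$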

The process $\{N_k\}$ is an i.i.d. circularly-symmetric complex 
Gaussian process with mean 0 and $E[|N_k|^2] = \sigma_N^2 \Delta$, while 
the process $\{\Theta_k\}$ is the discrete-time Wiener process 

$$\Theta_k = \Theta_{k-1} + W_k \tag{17}$$

where $\Theta_1$ is uniform on $[-\pi, \pi)$ and $\{W_k\}$ is an i.i.d. real 
Gaussian process with mean 0 and $E[|W_k|^2] = 2\pi\beta\Delta$. The 
process $\{F_k\}$ is an i.i.d. process. Moreover, $\{F_k\}$ and $\{W_k\}$ 
are independent of $\{N_k\}$ but not independent of each other. 
Equations (9)-(11) imply the power constraint 

$$E\left[ \frac{1}{n} \sum_{m=1}^{n} |X_m|^2 \right] \le P = \mathcal{P} \, T_{\text{symbol}}. \tag{18}$$
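As an illustration of the discrete-time model (14)-(17), the following sketch draws the oversampled outputs numerically. It is not part of the original derivation: the function name `simulate_samples`, the sub-grid parameter `fine` (a Riemann-sum approximation of the integral in (15)) and the normalization $T_{\text{symbol}} = 1$ are assumptions made for this example.

```python
import cmath
import math
import random


def simulate_samples(symbols, L, beta, sigma2_n, fine=20, rng=None):
    """Draw Y_1..Y_{nL} of model (14), assuming T_symbol = 1 (so Delta = 1/L).

    Each sample integrates x * exp(j*theta(t)) over one sample interval,
    which equals X * Delta * exp(j*Theta_k) * F_k, and adds noise N_k.
    """
    rng = rng or random.Random()
    delta = 1.0 / L
    dt = delta / fine                                   # sub-grid step (assumption)
    step_std = math.sqrt(2.0 * math.pi * beta * dt)     # Wiener increment over dt
    noise_std = math.sqrt(sigma2_n * delta / 2.0)       # per real dimension
    theta = rng.uniform(-math.pi, math.pi)              # Theta_1 uniform on [-pi, pi)
    out = []
    for x in symbols:
        for _ in range(L):
            acc = 0j
            for _ in range(fine):
                acc += cmath.exp(1j * theta) * dt       # left Riemann sum
                theta += rng.gauss(0.0, step_std)
            noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
            out.append(x * acc + noise)
    return out
```

With `beta = 0` and `sigma2_n = 0` every sample has magnitude $|x| \Delta$, which is a convenient sanity check; for `beta > 0` the magnitudes fluctuate, which is the phase-to-amplitude conversion caused by filtering.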



IV. Lower Bound 

For the $k$th input symbol $X_k$ we have $L$ outputs, so it is 
convenient to group the $L$ samples per symbol in one vector 
and define $\mathbf{Y}_k = (Y_{(k-1)L+1}, Y_{(k-1)L+2}, \ldots, Y_{(k-1)L+L})$. We 
further define $X_A = |X|$ and $X_\Phi = \angle X$. We decompose the 
mutual information using the chain rule into two parts: 

$$I(X^n; \mathbf{Y}^n) = I(X_A^n; \mathbf{Y}^n) + I(X_\Phi^n; \mathbf{Y}^n | X_A^n). \tag{19}$$

The first term represents the contribution of the amplitude 
modulation while the second term represents the contribution 
of the phase modulation. We focus on the amplitude contribution 
and use $I(X_\Phi^n; \mathbf{Y}^n | X_A^n) \ge 0$ to obtain the lower 
bound 

$$I(X^n; \mathbf{Y}^n) \ge I(X_A^n; \mathbf{Y}^n). \tag{20}$$
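Suppose that $X_A^n$ is i.i.d. Hence, we have 

$$\begin{aligned}
I(X_A^n; \mathbf{Y}^n) &\overset{(a)}{=} \sum_{k=1}^{n} I(X_{A,k}; \mathbf{Y}^n | X_A^{k-1}) \\
&\overset{(b)}{=} \sum_{k=1}^{n} \left[ H(X_{A,k}) - H(X_{A,k} | \mathbf{Y}^n, X_A^{k-1}) \right] \\
&\overset{(c)}{\ge} \sum_{k=1}^{n} I(X_{A,k}; \mathbf{Y}_k) \\
&\overset{(d)}{\ge} \sum_{k=1}^{n} I(X_{A,k}; V_k)
\end{aligned} \tag{21}$$

where 

$$V_k = \sum_{\ell=1}^{L} |Y_{(k-1)L+\ell}|^2. \tag{22}$$

Step (a) follows from the chain rule of mutual information, 
(b) follows from the independence of $X_{A,1}, X_{A,2}, \ldots, X_{A,n}$, 
(c) holds because conditioning does not increase entropy, and 
(d) follows from the data processing inequality. Since $X_A^n$ is 
identically distributed, $V^n$ is also identically distributed 
and we have, for $k \ge 2$, 

$$I(X_{A,k}; V_k) = I(X_{A,1}; V_1). \tag{23}$$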



In the rest of this section, we consider only one symbol 
(k = 1) and drop the time index. Moreover, we assume that 
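$T_{\text{symbol}} = 1$ for simplicity. By combining (22) and (14), we 
have 

$$\begin{aligned}
V &= \sum_{\ell=1}^{L} \left( X_A^2 \Delta^2 |F_\ell|^2 + 2 X_A \Delta \, \Re[e^{jX_\Phi} e^{j\Theta_\ell} F_\ell N_\ell^*] + |N_\ell|^2 \right) \\
&= X_A^2 \Delta G + 2 X_A \Delta Z_1 + Z_0
\end{aligned} \tag{24}$$

where $G$, $Z_1$ and $Z_0$ are defined as 

$$G = \frac{1}{L} \sum_{\ell=1}^{L} |F_\ell|^2 \tag{25}$$
$$Z_1 = \sum_{\ell=1}^{L} \Re[e^{jX_\Phi} e^{j\Theta_\ell} F_\ell N_\ell^*] \tag{26}$$
$$Z_0 = \sum_{\ell=1}^{L} |N_\ell|^2. \tag{27}$$

The second-order statistics of $Z_1$ and $Z_0$ are 

$$\begin{aligned}
E[Z_1] &= 0, & \mathrm{Var}[Z_1] &= \sigma_N^2 E[G]/2 \\
E[Z_0] &= \sigma_N^2, & \mathrm{Var}[Z_0] &= \sigma_N^4 \Delta \\
E[Z_1 (Z_0 - E[Z_0])] &= 0.
\end{aligned} \tag{28}$$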
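The decomposition of $V$ into a signal term, a cross term and a noise term can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper name `decompose_v` and all numerical values are hypothetical, $T_{\text{symbol}} = 1$ so that $\Delta = 1/L$, and the per-sample quantities $F_\ell$, $\Theta_\ell$, $N_\ell$ are arbitrary test values rather than draws from the true processes. The identity holds sample by sample, so any values serve.

```python
import cmath
import random


def decompose_v(x, f, theta, noise, delta):
    """Return (V, G, Z1, Z0) for one symbol x and per-sample lists f, theta, noise."""
    x_phi = cmath.phase(x)                      # phase of the input symbol
    v = g = z1 = z0 = 0.0
    for f_l, th_l, n_l in zip(f, theta, noise):
        y_l = x * delta * cmath.exp(1j * th_l) * f_l + n_l   # one output sample
        v += abs(y_l) ** 2                      # V: sum of squared magnitudes
        g += delta * abs(f_l) ** 2              # G = (1/L) * sum |F_l|^2
        z1 += (cmath.exp(1j * (x_phi + th_l)) * f_l * n_l.conjugate()).real
        z0 += abs(n_l) ** 2                     # Z0: noise energy
    return v, g, z1, z0


# Arbitrary (hypothetical) test values.
rng = random.Random(0)
L = 8
delta = 1.0 / L
x = 1.5 * cmath.exp(0.7j)
f = [complex(rng.uniform(0.5, 1.0), rng.uniform(-0.2, 0.2)) for _ in range(L)]
theta = [rng.uniform(-3.0, 3.0) for _ in range(L)]
noise = [complex(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(L)]

v, g, z1, z0 = decompose_v(x, f, theta, noise, delta)
x_a = abs(x)
residual = v - (x_a ** 2 * delta * g + 2.0 * x_a * delta * z1 + z0)
```

Here `residual` should vanish up to floating-point rounding.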

By using the Auxiliary-Channel Lower Bound Theorem in 
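[6, Sec. VI], we have 

$$I(X_A; V) \ge E[-\log Q_V(V)] + E[\log Q_{V|X_A}(V|X_A)] \tag{29}$$

where $Q_{V|X_A}(v|x_A)$ is an arbitrary auxiliary channel and 

$$Q_V(v) = \int P_{X_A}(x_A) \, Q_{V|X_A}(v|x_A) \, dx_A \tag{30}$$

where $P_{X_A}(\cdot)$ is the true input distribution, i.e., $Q_V(\cdot)$ is 
the output distribution obtained by connecting the true input 
source to the auxiliary channel. $E[\cdot]$ is the expectation according 
to the true distribution. We choose the auxiliary channel 

$$Q_{V|X_A}(v|x_A) = \frac{1}{\sqrt{4\pi x_A^2 \Delta^2 \sigma_N^2}} \exp\left( -\frac{(v - x_A^2 \Delta - \sigma_N^2)^2}{4 x_A^2 \Delta^2 \sigma_N^2} \right). \tag{31}$$

It follows that 

$$\begin{aligned}
E[-\log Q_{V|X_A}(V|X_A)] = {} & \log \Delta + \frac{1}{2} \log(4\pi\sigma_N^2) + \frac{1}{2} E[\log(X_A^2)] \\
& + E\left[ \frac{(V - X_A^2 \Delta - \sigma_N^2)^2}{4 X_A^2 \Delta^2 \sigma_N^2} \right].
\end{aligned} \tag{32}$$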



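By using (24), we have 

$$\begin{aligned}
(V - X_A^2 \Delta - \sigma_N^2)^2 
&= \left( X_A^2 \Delta (G - 1) + 2 X_A \Delta Z_1 + (Z_0 - \sigma_N^2) \right)^2 \\
&= X_A^4 \Delta^2 (G - 1)^2 + 4 X_A^2 \Delta^2 Z_1^2 + (Z_0 - \sigma_N^2)^2 \\
&\quad + 4 X_A^3 \Delta^2 (G - 1) Z_1 + 2 X_A^2 \Delta (G - 1)(Z_0 - \sigma_N^2) \\
&\quad + 4 X_A \Delta Z_1 (Z_0 - \sigma_N^2)
\end{aligned} \tag{33}$$

and hence, using the second-order statistics (28), we have 

$$E\left[ \frac{(V - X_A^2 \Delta - \sigma_N^2)^2}{4 X_A^2 \Delta^2 \sigma_N^2} \right] = \frac{P}{4\sigma_N^2} E[(G-1)^2] + \frac{1}{2} E[G] + \frac{\sigma_N^2}{4\Delta} E\left[ \frac{1}{X_A^2} \right] \tag{34}$$

where we also used 

$$E[(G-1) Z_1] = 0. \tag{35}$$

Substituting (34) into (32) and using $E[G] \le 1$ yield 

$$\begin{aligned}
E[-\log Q_{V|X_A}(V|X_A)] \le {} & \log \Delta + \frac{1}{2} \log(4\pi\sigma_N^2) + \frac{1}{2} E[\log(X_A^2)] \\
& + \frac{P}{4\sigma_N^2} E[(G-1)^2] + \frac{1}{2} + \frac{\sigma_N^2}{4\Delta} E\left[ \frac{1}{X_A^2} \right].
\end{aligned} \tag{36}$$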



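It is convenient to define $X_P = X_A^2$. We choose the input 
distribution 

$$P_{X_P}(x_P) = \begin{cases} \dfrac{1}{\lambda} \exp\left( -\dfrac{x_P - P_{\min}}{\lambda} \right), & x_P \ge P_{\min}, \\ 0, & \text{otherwise} \end{cases} \tag{37}$$

where $0 < P_{\min} < P$ and $\lambda = P - P_{\min}$, so that 

$$E[X_P] = E[X_A^2] = P. \tag{38}$$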
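The input distribution (37) is an exponential distribution with mean $\lambda$ shifted by $P_{\min}$, so it is easy to sample. The following sketch draws squared amplitudes and estimates the average power; the function name `sample_x_p`, the seed and the numerical values are assumptions for illustration only.

```python
import math
import random


def sample_x_p(p, p_min, rng):
    """Draw X_P from the shifted exponential density (37)."""
    lam = p - p_min                              # lambda = P - P_min
    return p_min + rng.expovariate(1.0 / lam)    # expovariate takes the rate 1/lambda


rng = random.Random(2013)                        # arbitrary seed for reproducibility
p, p_min = 1.0, 0.5                              # hypothetical values, 0 < p_min < p
draws = [sample_x_p(p, p_min, rng) for _ in range(200000)]
mean_power = sum(draws) / len(draws)             # Monte Carlo estimate of E[X_P]
amplitudes = [math.sqrt(d) for d in draws]       # X_A = sqrt(X_P)
```

The estimate `mean_power` should be close to `p`, and no draw falls below `p_min`, which is what later keeps $E[1/X_A^2]$ bounded.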
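It follows from (30) and (37) that 

$$\begin{aligned}
Q_V(v) &= \int_{P_{\min}}^{\infty} \frac{1}{\lambda} \exp\left( -\frac{x_P - P_{\min}}{\lambda} \right) Q_{V|X_P}(v|x_P) \, dx_P \\
&\le \exp(P_{\min}/\lambda) \, F_V(v)
\end{aligned} \tag{39}$$

where 

$$Q_{V|X_P}(v|x_P) = Q_{V|X_A}(v|\sqrt{x_P}) \tag{40}$$

and 

$$F_V(v) = \int_0^{\infty} \frac{1}{\lambda} \exp\left( -\frac{x_P}{\lambda} \right) Q_{V|X_P}(v|x_P) \, dx_P. \tag{41}$$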

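The inequality in (39) follows from the non-negativity of the 
integrand. By combining (31), (40), (41) and making the 
change of variables $x = x_P \Delta$, we have 

$$\begin{aligned}
F_V(v) &= \int_0^{\infty} \frac{1}{\lambda\Delta} e^{-x/(\lambda\Delta)} \frac{1}{\sqrt{4\pi x \Delta \sigma_N^2}} \exp\left( -\frac{(v - x - \sigma_N^2)^2}{4 x \Delta \sigma_N^2} \right) dx \\
&= \frac{1}{\sqrt{\lambda\Delta(\lambda\Delta + 4\Delta\sigma_N^2)}} \exp\left( \frac{2}{4\Delta\sigma_N^2} \left[ v - \sigma_N^2 - |v - \sigma_N^2| \sqrt{1 + \frac{4\sigma_N^2}{\lambda}} \right] \right)
\end{aligned} \tag{42}$$

where we used equation (140) in Appendix A of [7]: 

$$\int_0^{\infty} \frac{1}{a} \exp\left( -\frac{x}{a} \right) \frac{1}{\sqrt{\pi b x}} \exp\left( -\frac{(u - x)^2}{b x} \right) dx = \frac{1}{\sqrt{a(a+b)}} \exp\left( \frac{2}{b} \left[ u - |u| \sqrt{1 + \frac{b}{a}} \right] \right). \tag{43}$$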
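The integral identity (43) can be verified numerically. The sketch below substitutes $x = s^2$, which removes the $1/\sqrt{x}$ factor, and applies a midpoint rule. The function names, the truncation point `s_max` and the step count are assumptions for this example and are adequate only for moderate values of $a$ and $b$.

```python
import math


def lhs_43(a, b, u, s_max=12.0, steps=40000):
    """Midpoint-rule value of the integral in (43) after substituting x = s**2."""
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s2 = ((i + 0.5) * h) ** 2                # x = s^2, always strictly positive
        total += math.exp(-s2 / a - (u - s2) ** 2 / (b * s2))
    # dx = 2 s ds cancels the 1/s in 1/sqrt(pi*b*x), leaving a constant prefactor.
    return 2.0 * total * h / (a * math.sqrt(math.pi * b))


def rhs_43(a, b, u):
    """Closed form on the right-hand side of (43)."""
    exponent = 2.0 / b * (u - abs(u) * math.sqrt(1.0 + b / a))
    return math.exp(exponent) / math.sqrt(a * (a + b))


# Largest discrepancy over a few hypothetical test points, for a = 1 and b = 2.
max_gap = max(abs(lhs_43(1.0, 2.0, u) - rhs_43(1.0, 2.0, u)) for u in (0.5, -0.7, 0.0))
```

For $u = 0$ the right-hand side reduces to $1/\sqrt{a(a+b)}$, which also follows directly from a Gaussian integral.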



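Therefore, we have 

$$\begin{aligned}
E[-\log F_V(V)] &= \frac{1}{2} \log(\Delta^2(\lambda^2 + 4\lambda\sigma_N^2)) - \frac{1}{2\Delta\sigma_N^2} \left( E[V - \sigma_N^2] - E[|V - \sigma_N^2|] \sqrt{1 + \frac{4\sigma_N^2}{\lambda}} \right) \\
&\overset{(a)}{\ge} \log(\lambda\Delta) + \frac{1}{2\sigma_N^2\Delta} \left( \sqrt{1 + \frac{4\sigma_N^2}{\lambda}} - 1 \right) E[V - \sigma_N^2] \\
&\overset{(b)}{\ge} \log(\lambda\Delta)
\end{aligned} \tag{44}$$

where (a) holds because the logarithmic function is monotonic 
and $E[|\cdot|] \ge E[\cdot]$, and (b) holds because 

$$\begin{aligned}
E[V - \sigma_N^2] &= E[X_A^2] \Delta E[G] + 2 E[X_A] \Delta E[Z_1] + E[Z_0] - \sigma_N^2 \\
&= P \Delta E[G] \ge 0.
\end{aligned} \tag{45}$$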

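The monotonicity of the logarithmic function and (39) yield 

$$\begin{aligned}
E[-\log Q_V(V)] &\ge E\left[ -\log\left( e^{P_{\min}/\lambda} F_V(V) \right) \right] \\
&\ge \log \lambda + \log \Delta - \frac{P_{\min}}{\lambda}
\end{aligned} \tag{46}$$

where the last inequality follows from (44). It follows from 
(29), (36) and (46) that 

$$\begin{aligned}
I(X_A; V) \ge {} & \log \lambda - \frac{P_{\min}}{\lambda} - \frac{1}{2} \log(4\pi\sigma_N^2) - \frac{1}{2} E[\log(X_A^2)] \\
& - \frac{P}{4\sigma_N^2} E[(G-1)^2] - \frac{1}{2} - \frac{\sigma_N^2}{4\Delta} E\left[ \frac{1}{X_A^2} \right].
\end{aligned} \tag{47}$$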



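If $P_{\min} = P/2$, then $\lambda = P - P_{\min} = P/2$ and we have 

$$E\left[ \frac{1}{X_A^2} \right] \le \frac{1}{P_{\min}} = \frac{2}{P} \tag{48}$$

and 

$$\begin{aligned}
E[\log(X_P)] &= \int_{\lambda}^{\infty} \frac{1}{\lambda} e^{-(x - \lambda)/\lambda} \log(x) \, dx \\
&\overset{(a)}{=} \log \lambda + \int_1^{\infty} e^{-(u-1)} \log(u) \, du \\
&\overset{(b)}{\le} \log \lambda + 1
\end{aligned} \tag{49}$$

where (a) follows by the change of variables $u = x/\lambda$, and 
(b) holds because $\log(u) \le u - 1$ for all $u > 0$. Substituting 
into (47), we obtain 

$$I(X_A; V) - \frac{1}{2} \log \mathrm{SNR} \ge -2 - \frac{1}{2} \log(8\pi) - \frac{1}{2\,\mathrm{SNR}\,\Delta} - \frac{1}{4} \mathrm{SNR} \, E[(G-1)^2] \tag{50}$$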



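where $\mathrm{SNR} = P/\sigma_N^2$. Suppose $L$ grows with SNR such that 

$$L = \left\lceil \beta \sqrt{\mathrm{SNR}} \right\rceil. \tag{51}$$

Since $\Delta = 1/L$, we have 

$$\lim_{\mathrm{SNR} \to \infty} \mathrm{SNR} \, \Delta = \infty \quad \text{and} \quad \lim_{\mathrm{SNR} \to \infty} \mathrm{SNR} \, \Delta^2 = \frac{1}{\beta^2} \tag{52}$$

which implies 

$$\lim_{\mathrm{SNR} \to \infty} I(X_A; V) - \frac{1}{2} \log \mathrm{SNR} \ge -2 - \frac{1}{2} \log(8\pi) - \frac{\pi^2}{36} \tag{53}$$

because (see Appendix) 

$$\lim_{\Delta \to 0} \frac{E[(G-1)^2]}{\Delta^2} = \frac{(\pi\beta)^2}{9}. \tag{54}$$

By combining (20), (21), (23) and (53), we have 

$$\lim_{\mathrm{SNR} \to \infty} \frac{1}{n} I(X^n; \mathbf{Y}^n) - \frac{1}{2} \log \mathrm{SNR} \ge -2 - \frac{1}{2} \log(8\pi) - \frac{\pi^2}{36}. \tag{55}$$

This shows that the information rate grows logarithmically at 
high SNR with a pre-log factor of 1/2. 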
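To get a feeling for the numbers, the sketch below computes the oversampling factor $L = \lceil \beta \sqrt{\mathrm{SNR}} \rceil$ and the high-SNR expression $\frac{1}{2} \log \mathrm{SNR} - 2 - \frac{1}{2} \log(8\pi) - \pi^2/36$ in nats per symbol. The function names and the numerical values are assumptions for this example. Since the bound on the rate is an asymptotic statement, the value returned for a finite SNR is only indicative and is not itself a proven lower bound.

```python
import math


def samples_per_symbol(snr, beta):
    """Oversampling factor L = ceil(beta * sqrt(SNR))."""
    return math.ceil(beta * math.sqrt(snr))


def asymptotic_rate_expression(snr):
    """0.5*log(SNR) plus the constant offset of the high-SNR bound, in nats."""
    offset = -2.0 - 0.5 * math.log(8.0 * math.pi) - math.pi ** 2 / 36.0
    return 0.5 * math.log(snr) + offset


snr = 10.0 ** 6                        # hypothetical SNR of 60 dB, linear scale
beta = 0.5                             # hypothetical FWHM, with T_symbol = 1
L = samples_per_symbol(snr, beta)      # samples per symbol
delta = 1.0 / L                        # sample interval
snr_delta_sq = snr * delta ** 2        # tends to 1 / beta**2 as SNR grows
rate = asymptotic_rate_expression(snr)
```

The constant offset is roughly $-3.89$ nats, so the logarithmic growth dominates only at fairly high SNR.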

V. Discussion 

There is a wide literature on the design of receivers for the 
channel model (1) with a discrete-time Wiener phase noise, 
e.g., see [8], [9], [10] and references therein. One may want to 
make use of these designs, which raises the following question: 
"when is it justified to approximate the non-coherent fading 
model (3) with the discrete-time phase noise model (1)?" Our 
result suggests that this approximation may be justified when 
the phase variation is small over one symbol interval (i.e., 
when the phase noise linewidth is small compared to the 
symbol rate) and also the SNR is low to moderate. It must be 
noted that the SNR at which the high-SNR asymptotics start 
to manifest themselves depends on the application. 

We remark that the authors of [11] treated on-off keying 
transmission in the presence of Wiener phase noise by using a 
double-filtering receiver, which is composed of an intermediate 
frequency (IF) filter, followed by an envelope detector (square- 
law device) and then a post-detection filter. They showed that 
by optimizing the IF receiver bandwidth the double-filtering 
receiver outperforms the single-filtering (matched filter) receiver. 
Furthermore, they showed via computer simulation that 
the optimum IF bandwidth increases with the SNR. This is 
similar to our result in the sense that we require the number 
of samples per symbol to increase with the SNR in order to 
achieve a rate that grows logarithmically with the SNR. 

Finally, we remark that we have not computed the contribution 
of phase modulation to the information rate. We believe 
that using the multi-sample receiver it is possible to achieve an 
overall pre-log that is larger than 1/2. This matter is currently 
under investigation. 



VI. Conclusion 

We studied a communication system impaired by Wiener 
phase noise and AWGN. A discrete-time channel model based 
on filtering and oversampling is considered. The model accounts 
for the filtering effects on the phase noise. It is shown 
that at high SNR the multi-sample receiver achieves rates that 
grow logarithmically with at least a 1/2 pre-log factor if the 
number of samples per symbol grows with the square-root of 
the SNR. 

Appendix 

We discuss the limit in (54). We express $E[(G-1)^2]$ as 

$$\begin{aligned}
E[(G-1)^2] &= \mathrm{Var}(G) + (E[G] - 1)^2 \\
&= \frac{1}{L} \mathrm{Var}(|F_1|^2) + (E[|F_1|^2] - 1)^2
\end{aligned} \tag{56}$$



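where the last equality follows from the definition of $G$ in 
(25) and because $\{F_k\}$ is i.i.d. 

Next, we outline the steps for computing $E[|F_1|^4]$ and 
$E[|F_1|^2]$. Let $M$ be a positive integer, $\mathbf{c} = (c_1, \ldots, c_M)^T$ be 
a constant vector, $\mathbf{t} = (t_1, \ldots, t_M)^T$ be a non-negative real 
vector and $\boldsymbol{\Theta}(\mathbf{t}) = (\Theta(t_1) - \Theta(0), \ldots, \Theta(t_M) - \Theta(0))^T$, where 
$\Theta(t)$ is defined in (6). We have 

$$\begin{aligned}
E\left[ \frac{1}{\Delta^M} \int_0^{\Delta} \cdots \int_0^{\Delta} \exp(j \mathbf{c}^T \boldsymbol{\Theta}(\mathbf{t})) \, d\mathbf{t} \right] 
&\overset{(a)}{=} \frac{1}{\Delta^M} \int_0^{\Delta} \cdots \int_0^{\Delta} E\left[ \exp(j \mathbf{c}^T \boldsymbol{\Theta}(\mathbf{t})) \right] d\mathbf{t} \\
&\overset{(b)}{=} \frac{1}{\Delta^M} \int_0^{\Delta} \cdots \int_0^{\Delta} \exp\left( -\frac{1}{2} \mathbf{c}^T \Sigma(\mathbf{t}) \mathbf{c} \right) d\mathbf{t} \\
&\overset{(c)}{=} \int_0^1 \cdots \int_0^1 \exp\left( -\frac{\Delta}{2} \mathbf{c}^T \Sigma(\mathbf{u}) \mathbf{c} \right) d\mathbf{u}
\end{aligned} \tag{57}$$

where $d\mathbf{t} = dt_M \cdots dt_1$ and $\Sigma(\mathbf{t})$ is the covariance matrix of 
$\boldsymbol{\Theta}(\mathbf{t})$ whose entries are given by 

$$\Sigma_{ij}(\mathbf{t}) = 2\pi\beta \min\{t_i, t_j\}, \quad \text{for } i, j = 1, \ldots, M. \tag{58}$$

Step (a) follows from the linearity of expectation, (b) follows 
by using the characteristic function of a Gaussian random 
vector, and (c) follows from the transformation of variables 
$\mathbf{t} = \Delta \mathbf{u}$. We define 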



$$\alpha = e^{-\pi\beta\Delta} \tag{59}$$

and use $M = 2$ and $\mathbf{c} = (-1, 1)^T$ in (57) to compute 

$$E[|F_1|^2] = \frac{2(\alpha - 1 - \log \alpha)}{(\log \alpha)^2}. \tag{60}$$
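For $M = 2$ and $\mathbf{c} = (-1, 1)^T$ the quadratic form in (57) is $\mathbf{c}^T \Sigma(\mathbf{u}) \mathbf{c} = 2\pi\beta |u_1 - u_2|$, so the closed form for $E[|F_1|^2]$ can be compared with a direct two-dimensional quadrature. The sketch below does this with a midpoint rule; the function names, the grid size and the numerical values are assumptions made for illustration.

```python
import math


def mean_abs_f_sq_closed_form(beta, delta):
    """E[|F_1|^2] in closed form, with alpha = exp(-pi * beta * delta)."""
    log_alpha = -math.pi * beta * delta
    alpha = math.exp(log_alpha)
    return 2.0 * (alpha - 1.0 - log_alpha) / log_alpha ** 2


def mean_abs_f_sq_quadrature(beta, delta, n=200):
    """Midpoint-rule value of the unit-square integral for M = 2, c = (-1, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u1 = (i + 0.5) * h
        for k in range(n):
            u2 = (k + 0.5) * h
            # (delta / 2) * c^T Sigma(u) c = pi * beta * delta * |u1 - u2|
            total += math.exp(-math.pi * beta * delta * abs(u1 - u2))
    return total * h * h


closed = mean_abs_f_sq_closed_form(beta=1.0, delta=0.25)    # hypothetical values
numeric = mean_abs_f_sq_quadrature(beta=1.0, delta=0.25)
```

Both values lie strictly between 0 and 1, reflecting the average energy loss caused by integrating a drifting phase over one sample interval.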



We also have, using $M = 4$ and $\mathbf{c} = (-1, 1, -1, 1)^T$ in (57), 

$$E[|F_1|^4] = \frac{783 - 784\alpha + \alpha^4 + 540 \log \alpha + 240 \alpha \log \alpha + 144 (\log \alpha)^2}{18 (\log \alpha)^4}. \tag{61}$$

Computing the integrals is tedious but straightforward. Finally, 
it follows from (56) and (59)-(61) that 

$$\lim_{\Delta \to 0} \frac{E[(G-1)^2]}{\Delta^2} = \lim_{\alpha \to 1} \frac{(\pi\beta)^2 (E[|F_1|^2] - 1)^2}{(\log \alpha)^2} = \frac{(\pi\beta)^2}{9}. \tag{62}$$
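The limit can also be observed numerically from the closed-form moments. The sketch below evaluates $E[(G-1)^2]/\Delta^2$ through the variance decomposition of $G$ for increasing $L$, with $\Delta = 1/L$. The function names and the value of $\beta$ are assumptions for this example. Note that the closed form for $E[|F_1|^4]$ involves heavy cancellation in floating point, so $L$ is kept moderate here.

```python
import math


def moments_of_f(beta, delta):
    """Return (E[|F_1|^2], E[|F_1|^4]) from the closed-form expressions."""
    la = -math.pi * beta * delta             # log(alpha)
    al = math.exp(la)                        # alpha
    m2 = 2.0 * (al - 1.0 - la) / la ** 2
    m4 = (783.0 - 784.0 * al + al ** 4 + 540.0 * la + 240.0 * al * la
          + 144.0 * la ** 2) / (18.0 * la ** 4)
    return m2, m4


def normalised_mse_of_g(beta, L):
    """E[(G-1)^2] / Delta^2 with Delta = 1/L, using Var(G) = Var(|F_1|^2) / L."""
    delta = 1.0 / L
    m2, m4 = moments_of_f(beta, delta)
    var_f = m4 - m2 ** 2                     # Var(|F_1|^2)
    return (var_f / L + (m2 - 1.0) ** 2) / delta ** 2


beta = 1.0                                   # hypothetical FWHM, with T_symbol = 1
limit = (math.pi * beta) ** 2 / 9.0          # claimed limit as Delta -> 0
ratios = [normalised_mse_of_g(beta, L) for L in (8, 16, 32, 64)]
```

Under these assumptions the entries of `ratios` approach `limit` as $L$ grows, with a gap of about one percent at $L = 64$.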